Unsupervised speaker diarization using riemannian manifold clustering
نویسندگان
چکیده
We address the problem of speaker clustering for robust unsupervised speaker diarization. We model each speakerhomogeneous segment as one single full multivariate Gaussian probability density function (pdf) and take into consideration the Riemannian property of Gaussian pdfs. By assuming that segments from different speakers lie on different (possibly intersected) sub-manifolds of the manifold of Gaussian pdfs, we formulate the original problem as a Riemannian manifold clustering problem. To apply the computationally simple Riemannian locally linear embedding (LLE) algorithm, we impose a constraint on the length of each segment so as to ensure the fitness of single-Gaussian modeling and to increase the chance that all k-nearest neighbors of a pdf are from the same submanifold (speaker). Experiments on the microphone-recorded conversational interviews from NIST 2010 speaker recognition evaluation set demonstrate promising results of less than 1% DER.
منابع مشابه
Online Diarization of Telephone Conversations
Speaker diarization systems attempts to perform segmentation and labeling of a conversation between R speakers, while no prior information is given regarding the conversation. Diarization systems basically tries to answer the question ”Who spoke when?”. In order to perform speaker diarization, most state of the art diarization systems operate in an off-line mode, that is, all of the samples of ...
متن کاملTowards a Better Integration of Written Names for Unsupervised Speakers Identification in Videos
Existing methods for unsupervised identification of speakers in TV broadcast usually rely on the output of a speaker diarization module and try to name each cluster using names provided by another source of information: we call it “late naming”. Hence, written names extracted from title blocks tend to lead to high precision identification, although they cannot correct errors made during the clu...
متن کاملOn the Use of Spectral and Iterative Methods for Speaker Diarization
This paper extends upon our previous work using i-vectors for speaker diarization. We examine the effectiveness of spectral clustering as an alternative to our previous approach using Kmeans clustering and adapt a previously-used heuristic to estimate the number of speakers. Additionally, we consider an iterative optimization scheme and experiment with its ability to improve both cluster assign...
متن کاملUnsupervised Methods for Speaker Diarization
Given a stream of unlabeled audio data, speaker diarization is the process of determining “who spoke when.” We propose a novel approach to solving this problem by taking advantage of the effectiveness of factor analysis as a front-end for extracting speaker-specific features and exploiting the inherent variabilities in the data through the use of unsupervised methods. Upon initial evaluation, o...
متن کاملDomain Adaptation of PLDA Models in Broadcast Diarization by Means of Unsupervised Speaker Clustering
This work presents a new strategy to perform diarization dealing with high variability data, such as multimedia information in broadcast. This variability is highly noticeable among domains (inter-domain variability among chapters, shows, genres, etc.). Therefore, each domain requires its own specific model to obtain the optimal results. We propose to adapt the PLDA models of our diarization sy...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2014